Search CORE

38 research outputs found

Text-mining and ontologies: new approaches to knowledge discovery of microbial diversity

Author: Bossy Robert
Chaix Estelle
Deléger Louise
Nédellec Claire
Publication date: 24/10/2017

Microbiology research has access to a very large amount of public information on the habitats of microorganisms. Many areas of microbiology research use this information, primarily in biodiversity studies. However, the habitat information is expressed in unstructured natural language, which hinders its exploitation at large scale. Similar habitats are very commonly described by different terms, e.g. intestine and gut, which makes them hard to compare automatically. The use of a common reference to standardize these habitat descriptions, as claimed by (Ivana et al., 2010), is a necessity. We propose the ontology called OntoBiotope, which we have been developing since 2010. The OntoBiotope ontology is in a formal machine-readable representation that enables indexing of information as well as conceptualization and reasoning. Comment: 5 pages

arXiv.org e-Print Archive

HAL Descartes

Extracting lay paraphrases of specialized expressions from monolingual comparable medical corpora

Author: Louise Deléger
Pierre Zweigenbaum
Publication date: 01/01/2009

Whereas multilingual comparable corpora have been used to identify translations of words or terms, monolingual corpora can help identify paraphrases. The present work addresses paraphrases found between two different discourse types: specialized and lay texts. We therefore built comparable corpora of specialized and lay texts in order to detect equivalent lay and specialized expressions. We identified two devices used in such paraphrases: nominalizations and neo-classical compounds. The results showed that the paraphrases had good precision and that nominalizations were indeed relevant in the context of studying the differences between specialized and lay language. Neo-classical compounds were less conclusive. This study also demonstrates that simple paraphrase acquisition methods can work on texts with a rather small degree of similarity, once similar text segments are detected.

CiteSeerX

Crossref

Detecting negation of medical problems in French clinical notes

Author: Deléger Louise
Grouin Cyril
Publication venue: HAL CCSD
Publication date: 01/01/2012

HAL Descartes

Bacteria biotope annotation guidelines

Author: Bossy Robert
Deléger Louise

ProdInra

Extracting medical information from narrative patient records: the case of medication-related information

Author: Deléger Louise
Grouin Cyril
Zweigenbaum Pierre
Publication venue: BMJ Group

Crossref

PubMed Central

Design of an extensive information representation scheme for clinical narratives

Author: Campillos Leonardo
Deléger Louise
Ligozat Anne-Laure
Publication date: 01/01/2017

Background: Knowledge representation frameworks are essential to the understanding of complex biomedical processes, and to the analysis of biomedical texts that describe them. Combined with natural language processing (NLP), they have the potential to contribute to retrospective studies by unlocking important phenotyping information contained in the narrative content of electronic health records (EHRs). This work aims to develop an extensive information representation scheme for clinical information contained in EHR narratives, and to support secondary use of EHR narrative data to answer clinical questions. Methods: We review recent work that proposed information representation schemes and applied them to the analysis of clinical narratives. We then propose a unifying scheme that supports the extraction of information to address a large variety of clinical questions. Results: We devised a new information representation scheme for clinical narratives that comprises 13 entities, 11 attributes and 37 relations. The associated annotation guidelines can be used to consistently apply the scheme to clinical narratives and are available at https://cabernet.limsi.fr/annotation_guide_for_the_merlot_french_clinical_corpus-Sept2016.pdf. Conclusion: The information scheme includes many elements of the major schemes described in the clinical natural language processing literature, as well as a uniquely detailed set of relations.

ProdInra